DROPS

Document

DOI: 10.4230/LIPIcs.CONCUR.2020.21

Model-Free Reinforcement Learning for Stochastic Parity Games

Authors: Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak

Published in: LIPIcs, Volume 171, 31st International Conference on Concurrency Theory (CONCUR 2020)

Abstract

This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require the knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1 1/2-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on the experimental evaluations of both reductions.

Cite as

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Model-Free Reinforcement Learning for Stochastic Parity Games. In 31st International Conference on Concurrency Theory (CONCUR 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 171, pp. 21:1-21:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{hahn_et_al:LIPIcs.CONCUR.2020.21,
  author =	{Hahn, Ernst Moritz and Perez, Mateo and Schewe, Sven and Somenzi, Fabio and Trivedi, Ashutosh and Wojtczak, Dominik},
  title =	{{Model-Free Reinforcement Learning for Stochastic Parity Games}},
  booktitle =	{31st International Conference on Concurrency Theory (CONCUR 2020)},
  pages =	{21:1--21:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-160-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{171},
  editor =	{Konnov, Igor and Kov\'{a}cs, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2020.21},
  URN =		{urn:nbn:de:0030-drops-128332},
  doi =		{10.4230/LIPIcs.CONCUR.2020.21},
  annote =	{Keywords: Reinforcement learning, Stochastic games, Omega-regular objectives}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2017.35

Model Checking Omega-regular Properties for Quantum Markov Chains

Authors: Yuan Feng, Ernst Moritz Hahn, Andrea Turrini, and Shenggang Ying

Published in: LIPIcs, Volume 85, 28th International Conference on Concurrency Theory (CONCUR 2017)

Abstract

Quantum Markov chains are an extension of classical Markov chains which are labelled with super-operators rather than probabilities. They allow to faithfully represent quantum programs and quantum protocols. In this paper, we investigate model checking omega-regular properties, a very general class of properties (including, e.g., LTL properties) of interest, against this model. For classical Markov chains, such properties are usually checked by building the product of the model with a language automaton. Subsequent analysis is then performed on this product. When doing so, one takes into account its graph structure, and for instance performs different analyses per bottom strongly connected component (BSCC). Unfortunately, for quantum Markov chains such an approach does not work directly, because super-operators behave differently from probabilities. To overcome this problem, we transform the product quantum Markov chain into a single super-operator, which induces a decomposition of the state space (the tensor product of classical state space and the quantum one) into a family of BSCC subspaces. Interestingly, we show that this BSCC decomposition provides a solution to the issue of model checking omega-regular properties for quantum Markov chains.

Cite as

Yuan Feng, Ernst Moritz Hahn, Andrea Turrini, and Shenggang Ying. Model Checking Omega-regular Properties for Quantum Markov Chains. In 28th International Conference on Concurrency Theory (CONCUR 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 85, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{feng_et_al:LIPIcs.CONCUR.2017.35,
  author =	{Feng, Yuan and Hahn, Ernst Moritz and Turrini, Andrea and Ying, Shenggang},
  title =	{{Model Checking Omega-regular Properties for Quantum Markov Chains}},
  booktitle =	{28th International Conference on Concurrency Theory (CONCUR 2017)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-048-4},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{85},
  editor =	{Meyer, Roland and Nestmann, Uwe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2017.35},
  URN =		{urn:nbn:de:0030-drops-77818},
  doi =		{10.4230/LIPIcs.CONCUR.2017.35},
  annote =	{Keywords: Quantum Markov chains, model checking, omega-regular properties, bottom strongly connected component}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2015.354

Lazy Probabilistic Model Checking without Determinisation

Authors: Ernst Moritz Hahn, Guangyuan Li, Sven Schewe, Andrea Turrini, and Lijun Zhang

Published in: LIPIcs, Volume 42, 26th International Conference on Concurrency Theory (CONCUR 2015)

Abstract

The bottleneck in the quantitative analysis of Markov chains and Markov decision processes against specifications given in LTL or as some form of nondeterministic Büchi automata is the inclusion of a determinisation step of the automaton under consideration. In this paper, we show that full determinisation can be avoided: subset and breakpoint constructions suffice. We have implemented our approach - both explicit and symbolic versions - in a prototype tool. Our experiments show that our prototype can compete with mature tools like PRISM.

Cite as

Ernst Moritz Hahn, Guangyuan Li, Sven Schewe, Andrea Turrini, and Lijun Zhang. Lazy Probabilistic Model Checking without Determinisation. In 26th International Conference on Concurrency Theory (CONCUR 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 42, pp. 354-367, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{hahn_et_al:LIPIcs.CONCUR.2015.354,
  author =	{Hahn, Ernst Moritz and Li, Guangyuan and Schewe, Sven and Turrini, Andrea and Zhang, Lijun},
  title =	{{Lazy Probabilistic Model Checking without Determinisation}},
  booktitle =	{26th International Conference on Concurrency Theory (CONCUR 2015)},
  pages =	{354--367},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-91-0},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{42},
  editor =	{Aceto, Luca and de Frutos Escrig, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2015.354},
  URN =		{urn:nbn:de:0030-drops-53918},
  doi =		{10.4230/LIPIcs.CONCUR.2015.354},
  annote =	{Keywords: Markov decision processes, model checking, PLTL, determinisation}
}

Search Results

Documents authored by Hahn, Ernst Moritz

Model-Free Reinforcement Learning for Stochastic Parity Games

Abstract

Cite as

Model Checking Omega-regular Properties for Quantum Markov Chains

Abstract

Cite as

Lazy Probabilistic Model Checking without Determinisation

Abstract

Cite as

Thanks for your feedback!

Could not send message